Stochastic F0 contour model based on the clustering of F0 shapes of a syntactic unit
Authors
Abstract
This paper describes a stochastic model relating the F0 contour of a sentence to its linguistic features, for use in speech synthesis. The F0 contour of a sentence is represented as a concatenation of F0 patterns, one per bunsetsu, a Japanese syntactic unit. Each bunsetsu F0 pattern is composed of an F0 average and an F0 shape. The F0 average is predicted independently for each bunsetsu from its linguistic features using quantification theory. The most probable sequence of bunsetsu F0 shapes for a sentence is found in the F0 shape database using a probabilistic measure. The probability that an F0 contour is observed for a sentence is defined by two kinds of probability: the F0 shape production probability and the F0 shape bigram probability. The latter is the probability that two F0 shapes occur adjacently, analogous to a word bigram in speech recognition. Several typical bunsetsu F0 shapes are extracted by clustering the training data and stored in the F0 shape database. The F0 shape production probability is computed for each bunsetsu from the distribution of linguistic features in the cluster.
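The abstract does not say how the most probable shape sequence is searched for. One common way to combine a per-unit probability with a bigram over adjacent units is a Viterbi search, sketched below in Python. This is an illustration under stated assumptions, not the authors' implementation: the function name `best_shape_sequence`, the table layouts `production` and `bigram`, the product form of the combined score, and the `FLOOR` smoothing constant are all hypothetical.

```python
import math

FLOOR = 1e-12  # assumed floor so unseen events do not produce log(0)


def log_p(p):
    """Log of a probability, floored to avoid math domain errors."""
    return math.log(max(p, FLOOR))


def best_shape_sequence(production, bigram):
    """Viterbi search over F0 shape clusters (illustrative sketch).

    production[t][k]: assumed P(shape k | linguistic features of bunsetsu t)
    bigram[(j, k)]:   assumed P(shape k immediately follows shape j)
    Returns one shape id per bunsetsu.
    """
    if not production:
        return []
    shapes = list(production[0].keys())
    # Scores for the first bunsetsu use the production probability only.
    score = {k: log_p(production[0].get(k, 0.0)) for k in shapes}
    back = []  # back[t - 1][k] = best predecessor of shape k at position t
    for t in range(1, len(production)):
        new_score, pointers = {}, {}
        for k in shapes:
            # Pick the predecessor that maximizes path score plus transition.
            prev = max(
                shapes,
                key=lambda j: score[j] + log_p(bigram.get((j, k), 0.0)),
            )
            new_score[k] = (
                score[prev]
                + log_p(bigram.get((prev, k), 0.0))
                + log_p(production[t].get(k, 0.0))
            )
            pointers[k] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final shape.
    path = [max(shapes, key=lambda k: score[k])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy tables with two hypothetical shape clusters and three bunsetsu.
production = [
    {"rise": 0.7, "fall": 0.3},
    {"rise": 0.4, "fall": 0.6},
    {"rise": 0.5, "fall": 0.5},
]
bigram = {
    ("rise", "rise"): 0.2, ("rise", "fall"): 0.8,
    ("fall", "rise"): 0.6, ("fall", "fall"): 0.4,
}
print(best_shape_sequence(production, bigram))
```

Working in log space keeps long sentences from underflowing. In a full system the selected shapes would then be offset by the separately predicted F0 average of each bunsetsu, a step this sketch leaves out.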
Similar resources
Generation of F0 contour using stochastic mapping and vector quantization control parameters
This paper introduces an F0 contour generation method for text-to-speech synthesis using stochastic mapping and vector quantization control parameters. This model uses a new F0 contour labelling scheme based on the RFC (Rise/Fall/Connection) model [1], which describes F0 contour patterns with seven F0 labels and three pause labels. This paper also suggests an efficient selection method for contro...
Generating F0 contours by statistical manipulation of natural F0 shapes
This paper proposes a method of generating F0 contours from natural F0 segmental shapes for speech synthesis. The extracted shapes of F0 units are basically kept unchanged, by eliminating any averaging operation in the analysis phase and minimizing modification operations in the synthesis phase. The use of “kept-unchanged” F0 shapes has a great potential to incorporate a wide variety of speakin...
Smooth contour estimation in data-driven pitch modelling
Apple's next-generation text-to-speech (TTS) system in MacOS X uses a superpositional pitch model, comprising a relatively smooth underlying F0 contour and a separate contribution from the influence of the phonetic segments. This paper focuses on the data-driven modelling of the underlying contour, based on electroglottographic signals obtained from a corpus of reiterant speech. F0 extraction fr...
Hidden Markov Convolutive Mixture Model for Pitch Contour Analysis of Speech
This paper proposes a stochastic model of speech F0 contours, based on the stochastic formulation of the Fujisaki model. Our motivation for the stochastic formulation is twofold. Firstly, it allows us to derive a well-behaved algorithm for estimating the Fujisaki model parameters from a raw F0 contour. Secondly, it will open the door to incorporating the well-founded F0 contour model into vario...
An F0 Contour Model in Chinese Based on Templates of Prosodic Words
The problem of F0 contour generation in Chinese is addressed in this paper. An F0 contour model based on templates of prosodic words is proposed. Taking templates of prosodic word F0 contours as the basic units, the basic structure of the model is established with reference to the “small ripples on top of big waves” theory and the Fujisaki model. A three-layer prosodic hierarchy which consists o...